SELECT statement
Back to DuckDB Data Engineering Glossary
Overview
The SELECT
statement is the workhorse of SQL, used to retrieve and transform data from tables, views, or other data sources. It forms the foundation of data analysis by allowing you to specify exactly which data you want to examine and how you want it structured.
Basic Syntax
The simplest form retrieves all columns from a table:
SELECT * FROM employees;
Or specific columns:
SELECT first_name, last_name, salary FROM employees;
DuckDB-Specific Features
DuckDB extends the standard SELECT
syntax with several powerful features. You can omit the SELECT
keyword entirely when starting with FROM
:
FROM employees;
DuckDB also offers enhanced column selection with EXCLUDE
and REPLACE
:
SELECT * EXCLUDE (password, api_key) FROM users;
SELECT * REPLACE (price / 100 AS price_dollars) FROM orders;
Common Clauses
The SELECT
statement supports many clauses that modify its behavior:
-
WHERE
filters rows:SELECT * FROM orders WHERE total > 100;
-
GROUP BY
aggregates data:SELECT department, AVG(salary) FROM employees GROUP BY department;
-
ORDER BY
sorts results:SELECT * FROM products ORDER BY price DESC;
-
LIMIT
restricts number of rows:SELECT * FROM logs LIMIT 100;
Working with Multiple Tables
SELECT
can combine data from multiple tables using joins:
SELECT orders.id, customers.name, orders.total FROM orders JOIN customers ON orders.customer_id = customers.id;
Computed Columns
You can perform calculations or transformations in the SELECT
clause:
SELECT product_name, price, price * 0.9 AS discounted_price, UPPER(category) AS category_uppercase FROM products;
Subqueries
SELECT
statements can be nested within other queries:
SELECT department, employee_count FROM ( SELECT department, COUNT(*) as employee_count FROM employees GROUP BY department ) dept_counts WHERE employee_count > 10;